The Rademacher Complexity of Linear Transformation Classes
Author
Abstract
Bounds are given for the empirical and expected Rademacher complexity of classes of linear transformations from a Hilbert space H to a finite-dimensional space. The results imply generalization guarantees for graph regularization and multi-task subspace learning.

1 Introduction

Rademacher averages have been introduced to learning theory as an efficient complexity measure for function classes, motivated by tight, sample- or distribution-dependent generalization bounds ([10], [2]). Both the definition of Rademacher complexity and the generalization bounds extend easily from real-valued function classes to function classes with values in $\mathbb{R}^m$, as they are relevant to multi-task learning ([1], [12]).

There has been increasing interest in multi-task learning, which has been shown to be very effective in experiments ([7], [1]), and there have been some general studies of its generalization performance ([4], [5]). For a large collection of tasks there are usually more data available than for a single task, and these data may be put to coherent use through some constraint of relatedness. A practically interesting case is linear multi-task learning, which extends linear large-margin classifiers to vector-valued large-margin classifiers.

Different types of constraints have been proposed. Evgeniou et al. ([8], [9]) propose graph regularization, where the vectors defining the classifiers of related tasks have to be near each other. They also show that their scheme can be implemented in the framework of kernel machines. Ando and Zhang [1], on the other hand, require the classifiers to be members of a common low-dimensional subspace. They also give generalization bounds using Rademacher complexity, but these bounds increase with the dimension of the input space. This paper gives dimension-free bounds which apply to both approaches.
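To make the notion of a dimension-free bound concrete, the following minimal sketch treats the standard scalar-valued case only, not the vector-valued transformation classes analysed in the paper. For linear functionals $x \mapsto \langle w, x \rangle$ with $\|w\| \le B$, the Cauchy-Schwarz inequality gives the supremum in closed form, $\sup_{\|w\| \le B} \frac{1}{n} \sum_i \sigma_i \langle w, x_i \rangle = \frac{B}{n} \| \sum_i \sigma_i x_i \|$, and Jensen's inequality bounds its expectation by $\frac{B}{n} \sqrt{\sum_i \|x_i\|^2}$, which does not depend explicitly on the input dimension. The function names, the `radius` parameter and the Monte Carlo setup are illustrative assumptions, not taken from the paper.

```python
import math
import random


def empirical_rademacher_linear(xs, radius=1.0, n_draws=2000, seed=0):
    """Monte Carlo estimate of E_sigma sup_{||w|| <= radius} (1/n) sum_i sigma_i <w, x_i>.

    Hypothetical helper for illustration; xs is a list of equal-length vectors.
    """
    rng = random.Random(seed)
    n, dim = len(xs), len(xs[0])
    total = 0.0
    for _ in range(n_draws):
        # Draw independent uniform signs sigma_i in {-1, +1}.
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Signed sum of the sample points, coordinate by coordinate.
        s = [sum(sigma[i] * xs[i][k] for i in range(n)) for k in range(dim)]
        # By Cauchy-Schwarz, the supremum over the ball is radius * ||s|| / n.
        total += radius * math.sqrt(sum(v * v for v in s)) / n
    return total / n_draws


def jensen_bound(xs, radius=1.0):
    """Upper bound radius * sqrt(sum_i ||x_i||^2) / n on the expectation above."""
    n = len(xs)
    return radius * math.sqrt(sum(v * v for x in xs for v in x)) / n
```

Calling both functions on the same sample shows how close the simple norm-based bound is to the estimated complexity; the bound uses only the norms of the sample points, so it stays meaningful in an infinite-dimensional Hilbert space.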
1.1 Multi-task generalization and Rademacher complexity

Suppose we have m classification tasks, represented by m independent random variables $(X^l, Y^l)$ taking values in $\mathcal{X} \times \{-1, 1\}$, where $X^l$ models the random
Similar resources
Rounding Methods for Discrete Linear Classification (Extended Version)
Learning discrete linear classifiers is known to be a difficult challenge. In this paper, this learning task is cast as a combinatorial optimization problem: given a training sample formed by positive and negative feature vectors in Euclidean space, the goal is to find a discrete linear function that minimizes the cumulative hinge loss of the sample. Since this problem is NP-hard, we examine two...
Rounding Methods for Discrete Linear Classification
Learning discrete linear classifiers is known to be a difficult challenge. In this paper, this learning task is cast as a combinatorial optimization problem: given a training sample formed by positive and negative feature vectors in Euclidean space, the goal is to find a discrete linear function that minimizes the cumulative hinge loss of the sample. Since this problem is NP-hard, we examine two...
Rademacher Complexity Margin Bounds for Learning with a Large Number of Classes
This paper presents improved Rademacher complexity margin bounds that scale linearly with the number of classes as opposed to the quadratic dependence of existing Rademacher complexity margin-based learning guarantees. We further use this result to prove a novel generalization bound for multi-class classifier ensembles that depends only on the Rademacher complexity of the hypothesis classes to ...
Lecture 6: Rademacher Complexity
In this lecture, we discuss Rademacher complexity, which is a different (and often better) way to obtain generalization bounds for learning hypothesis classes.
Rademacher Margin Complexity
where $\sigma_1, \dots, \sigma_n$ are i.i.d. Rademacher random variables. $R_n(F)$ characterizes the extent to which the functions in F can be best correlated with a Rademacher noise sequence. A number of generalization error bounds have been proposed based on Rademacher complexity [1,2]. In this open problem, we introduce a new complexity measure for function classes. We focus on function classes F that is the conve...